Skilled Deep Research – Post-Mortem
Written: 2026-03-07 | Run: cmmc-templates (CMMC 2.0 templates for small business)
TL;DR
The run produced 5 sources from 5 workers in ~39 minutes. That's 1 source per worker. A properly functioning run should produce 10-20 sources per worker (50-100 total). Actual yield: ~5-10% of expected. The skill architecture is sound but has multiple critical bugs.
What Actually Happened (Timeline)
| Time (UTC) | Event |
|---|---|
| 05:30 | Ada spawned orchestrator |
| 05:30–05:35 | Orchestrator spawned 5-6 workers in parallel |
| 05:30–05:45 | Workers each fetched 1-2 URLs then went silent |
| 06:09 | Orchestrator declared "complete", merged 5 sources |
| 06:09 | Report written – only 1 source per worker |
Bug #1 – CRITICAL: Workers can't spawn (agentId missing)
What happened: The resume orchestrator failed immediately with:
"error": "ACP target agent is not configured. Pass `agentId` in `sessions_spawn` or set `acp.defaultAgent` in config."
Root cause: The worker prompt template in SKILL.md calls sessions_spawn without an agentId. Sub-agents (depth ≥ 1) can't inherit the default agent from config – they need it explicitly passed.
Evidence: The resume orchestrator (c17c0e28) died after 8 lines – it spawned, tried to spawn workers, got the ACP error, and stopped.
This is almost certainly why original workers also failed β if the orchestrator spawned workers using the same broken prompt, all workers would fail to spawn. The 5 results we got may have been from the orchestrator itself fetching URLs directly (hallucinating worker output), or from a version of the prompt that omitted the spawn call and fetched inline.
Fix: Add agentId to every sessions_spawn call in orchestrator and worker prompts:
```javascript
sessions_spawn({
  agentId: "ada",  // ← REQUIRED for sub-agents
  task: "...",
  runtime: "subagent",
  ...
})
```
The agentId needs to be threaded from Ada → orchestrator prompt → worker prompts at spawn time.
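As a sketch of that threading (the template text and function names here are illustrative, not the skill's actual variables), the orchestrator could render every worker spawn from a template that refuses to emit a call without an agentId:

```python
# Illustrative sketch: thread agentId from the top-level spawn down to every
# worker spawn call the orchestrator emits. Template text is hypothetical.
WORKER_SPAWN_TEMPLATE = """sessions_spawn({{
  agentId: "{agent_id}",  // REQUIRED: sub-agents do not inherit acp.defaultAgent
  task: "{task}",
  runtime: "subagent",
}})"""

def render_worker_spawn(agent_id: str, task: str) -> str:
    """Render a worker spawn call, refusing to emit one without an agentId."""
    if not agent_id:
        raise ValueError("agentId must be threaded in; sub-agents cannot inherit it")
    return WORKER_SPAWN_TEMPLATE.format(agent_id=agent_id, task=task)
```

Failing loudly at render time would have turned the silent worker deaths into an immediate, diagnosable orchestrator error.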
Bug #2 – CRITICAL: known-urls.txt not being updated
What happened: After 5 workers fetched multiple URLs, known-urls.txt contained exactly 1 URL.
Expected: Every worker should append fetched URLs to known-urls.txt for deduplication.
Root cause: The worker prompt says to "Append URL to known-urls.txt" but doesn't give the exact file path or exec command. Workers are inconsistent about whether they do this step. Also: if workers are actually running in-process in the orchestrator (due to Bug #1), they may not have write access to the right path.
Impact: No deduplication across workers. Workers could fetch the same URLs. Retry logic is also broken since known-urls.txt is the source of truth.
Fix: Give workers an explicit shell command:
```shell
echo "https://fetched-url.com" >> /home/sean/.openclaw/workspace-ada/skills-data/skilled-deep-research/[SLUG]/known-urls.txt
```
And verify the file exists and is writable at worker startup.
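A sketch of what the dedup step could look like if it moved from prompt instructions into a small helper (the function name is an assumption, not part of the current skill):

```python
import os

def record_url(known_urls_path: str, url: str) -> bool:
    """Append url to the shared known-urls file unless it is already present.

    Returns True if the URL was newly recorded, False if it was a duplicate.
    Opens in append mode so concurrent workers never truncate each other's
    entries (small line-sized appends are effectively atomic on Linux).
    """
    seen = set()
    if os.path.exists(known_urls_path):
        with open(known_urls_path) as f:
            seen = {line.strip() for line in f if line.strip()}
    if url in seen:
        return False
    with open(known_urls_path, "a") as f:
        f.write(url + "\n")
    return True
```

A worker that calls this before fetching gets deduplication and the retry source of truth for free.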
Bug #3 – HIGH: Community worker result data loss
What happened: community worker progress.json showed:
- urls_fetched: 9
- findings: 9
But community-results.md contained only 1 source block.
Root cause: Workers are supposed to checkpoint (append to results.md) after every URL. The community worker either:
1. Buffered all results in memory and wrote at the end (crashed before writing), or
2. Overwrote instead of appended on a second pass.
Fix: Enforce append-only writes in the worker prompt with explicit shell:
```shell
cat >> results.md << 'BLOCK'
### [score] [title](url)
...
---
BLOCK
```
Never write the whole file at once. Checkpoint after every single URL fetch.
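The same checkpoint discipline as a helper, assuming the block format shown above (the function name and the fsync choice are illustrative):

```python
import os

def checkpoint_finding(results_path: str, score: str, title: str,
                       url: str, notes: str) -> None:
    """Append one source block immediately after each fetch; never rewrite the file.

    "a" mode means a crash loses at most the in-flight block, never prior ones.
    """
    block = f"### [{score}] [{title}]({url})\n{notes}\n---\n"
    with open(results_path, "a") as f:
        f.write(block)
        f.flush()
        os.fsync(f.fileno())  # force the block to disk before fetching the next URL
```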
Bug #4 – HIGH: gov worker found 0 URLs
What happened: gov worker progress showed urls_found: 0, urls_fetched: 0. It was stuck on https://csrc.nist.gov/pubs/sp/800/171/a/final with no results.
Root cause: IPv6 blocking on .gov sites. The SKILL.md documents this explicitly:
"Our LXC uses IPv6 which Akamai CDNs can block on .gov/.mil sites. Never use raw web_fetch or curl without -4 on government sites."
The worker prompt tells workers to use the fetch script (which forces -4) but web_search results don't auto-use it β the worker has to consciously call the fetch script for every URL. If a worker instead used web_fetch directly (the native tool), .gov fetches silently fail or return bot-block pages.
Evidence: gov worker shows urls_found: 0 – meaning even the search returned nothing actionable, or the worker couldn't parse the results before stalling.
Fix:
1. Add an explicit validation step at worker start: verify the fetch script exists and returns 200 for a test URL.
2. Add to the worker prompt: "DO NOT use the web_fetch tool for any URL. ONLY use the fetch script."
3. Consider a pre-flight search test before committing to the URL list.
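Step 1 could be sketched as a pre-flight check; the fetch script's path and CLI shape are assumptions here, so adjust to the skill's actual helper:

```python
import subprocess

def preflight_fetch(fetch_script: str,
                    test_url: str = "https://csrc.nist.gov/") -> bool:
    """Verify the IPv4-forcing fetch script works before committing to a URL list.

    fetch_script is assumed to take a URL argument and print the body to stdout.
    Any failure returns False so the worker can abort loudly instead of
    silently reporting urls_found: 0.
    """
    try:
        result = subprocess.run(
            [fetch_script, test_url],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # Treat a non-zero exit or an empty body as a failed pre-flight.
    return result.returncode == 0 and len(result.stdout) > 0
```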
Bug #5 – MEDIUM: Binary file fetch (UnicodeDecodeError)
What happened: retry-queue.md contained:
- https://media.armis.com/raw/upload/cmmc-rfp-template.docx – reason: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0
Root cause: The fetch script returns raw binary for .docx/.pdf files. Workers try to decode as UTF-8 text, fail, and log to retry queue. The retry worker would have the same problem.
Fix: Detect binary content types before fetching the full body. If the Content-Type is application/vnd.openxmlformats or application/pdf, just log the direct download URL – don't try to read the content. The existence of a direct download link is the finding, not the file contents.
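One way to sketch that content-type gate with a HEAD request (the MIME prefix list is a best guess; extend as needed):

```python
from urllib.request import Request, urlopen

# MIME prefixes treated as binary downloads (best-guess list).
BINARY_TYPES = (
    "application/pdf",
    "application/vnd.openxmlformats",  # .docx / .xlsx / .pptx family
    "application/msword",
    "application/octet-stream",
)

def is_binary_content_type(ctype: str) -> bool:
    """True if a Content-Type header value looks like a document download."""
    return ctype.lower().split(";")[0].strip().startswith(BINARY_TYPES)

def classify_url(url: str, timeout: int = 15) -> str:
    """HEAD the URL before fetching; 'binary' means log the link, skip the body."""
    req = Request(url, method="HEAD")
    with urlopen(req, timeout=timeout) as resp:
        ctype = resp.headers.get("Content-Type", "")
    return "binary" if is_binary_content_type(ctype) else "text"
```

Routing 'binary' URLs straight to the results file also keeps them out of the retry queue, where they would fail identically on every pass.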
Bug #6 – MEDIUM: Orchestrator declared complete too fast
What happened: The orchestrator merged results at 06:09 – only ~39 minutes after workers spawned. The workers were still showing phase: fetching in their progress files at that point. The orchestrator didn't wait for completion signals.
Root cause: A2A signaling (workers → orchestrator) depends on sessions_send. If workers are broken (Bug #1), they never send WORKER_COMPLETE signals. The orchestrator's fallback is to poll progress files every 120s for up to 15 cycles (30 minutes max). After 15 cycles with no progress, it moves on – even if workers are stalled mid-fetch.
Impact: Orchestrator synthesized partial results and declared success.
Fix: Add a minimum threshold check before synthesis: "If fewer than 3 workers sent WORKER_COMPLETE and total findings < 10, do NOT synthesize β log a failure and alert Ada instead."
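The threshold check itself is a one-liner; a sketch with the numbers from the fix above as defaults:

```python
def should_synthesize(complete_signals: int, total_findings: int,
                      min_workers: int = 3, min_findings: int = 10) -> bool:
    """Gate synthesis: merge only when enough workers finished AND enough
    findings exist. Otherwise the orchestrator should log a failure and
    alert instead of declaring success."""
    return complete_signals >= min_workers and total_findings >= min_findings
```

On this run (0 WORKER_COMPLETE signals, 5 findings) the gate would have failed both conditions and flagged the run instead of shipping a report.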
Bug #7 – LOW: merge-reports.py parses results with regex, brittle
What happened: The report's source quality scores are all listed as [2/5] despite the underlying worker results showing [5/5] for the NIST templates. The merge script likely failed to parse the format correctly.
Root cause: merge-reports.py uses regex on the results markdown. Any formatting deviation (missing blank line, slightly different header) causes the parser to drop or misparse a source.
Fix: Switch to a more forgiving parser, or enforce results format with a schema validator workers run before writing. At minimum, add a format-check step to the worker prompt.
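A sketch of a more forgiving, line-based parser, assuming the `### [score] [title](url)` block format from the worker results (the tolerances for header depth and whitespace are guesses at likely deviations):

```python
import re

# Assumed block format, from the worker results excerpt:
#   ### [5/5] [Title](https://url)
# Tolerates 2-4 '#' characters and stray whitespace around the score.
HEADER = re.compile(r"^#{2,4}\s*\[(\d)\s*/\s*5\]\s*\[([^\]]+)\]\(([^)]+)\)")

def parse_results(markdown: str):
    """Parse source blocks line by line.

    Lines that don't match are skipped rather than silently corrupting
    scores, so a formatting deviation costs one source, not the whole file.
    """
    sources = []
    for line in markdown.splitlines():
        m = HEADER.match(line.strip())
        if m:
            sources.append({"score": int(m.group(1)),
                            "title": m.group(2),
                            "url": m.group(3)})
    return sources
```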
What Actually Worked
- Skill architecture – orchestrator → workers with A2A signaling is the right design
- Fetch script – when used correctly (fetch script instead of raw web_fetch), it works
- Checkpointing concept – progress.json files gave us enough telemetry to diagnose the failures
- Deduplication concept – known-urls.txt is the right approach, just not implemented correctly
- Search quality – the URLs workers found (cmmcaudit.org, GitHub, NIST) were relevant; the search step worked
Missing Capability: Site Crawling
Not a bug – a genuine missing feature. The current skill:
- Fetches individual URLs surfaced by search queries
- Does NOT follow links or traverse site structure
For deep research, this is a significant gap. Sites like cmmcaudit.org have a resource index page that links to 8+ template pages. Search engines only surface 1-2 of those. Without crawling, we miss 75%+ of available resources on resource-rich sites.
Proposed solution: A crawl.py helper script:
```python
# Given a root URL + relevance keywords, extract and score internal links
# Return top N links sorted by relevance score (anchor text match)
# Respect: depth limit (2), domain boundary, already-known URLs
```
Workers call this when they land on a page that looks like a resource index (template, download, tools pages).
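A sketch of what crawl.py's scoring core could look like (stdlib only; keyword scoring by anchor text as proposed above; depth limiting would sit in the caller):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collect (href, anchor_text) pairs from a page."""
    def __init__(self):
        super().__init__()
        self.links, self._href, self._text = [], None, []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []
    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)
    def handle_endtag(self, tag):
        if tag == "a" and self._href:
            self.links.append((self._href, " ".join(self._text).strip()))
            self._href = None

def score_links(html: str, base_url: str, keywords, known_urls=(), top_n=10):
    """Rank same-domain links by keyword hits in anchor text; skip known URLs."""
    parser = LinkExtractor()
    parser.feed(html)
    domain = urlparse(base_url).netloc
    scored = []
    for href, text in parser.links:
        url = urljoin(base_url, href)
        if urlparse(url).netloc != domain or url in known_urls:
            continue  # enforce the domain boundary and dedup against known-urls
        score = sum(1 for kw in keywords if kw.lower() in text.lower())
        if score > 0:
            scored.append((score, url, text))
    scored.sort(key=lambda t: -t[0])
    return scored[:top_n]
```

On a resource index page like cmmcaudit.org's, this would surface the template links that search engines never return.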
Priority Fix List
| Priority | Bug | Effort | Impact |
|---|---|---|---|
| 🔴 P0 | Bug #1: agentId missing from sessions_spawn | Low – add one field | Workers can't spawn at all |
| 🔴 P0 | Bug #3: Result data loss (buffer vs append) | Low – change write pattern | 80%+ of findings lost |
| 🔴 P0 | Bug #4: gov worker IPv6 block | Low – strengthen prompt | .gov sources completely inaccessible |
| 🟠 P1 | Bug #2: known-urls.txt not updated | Low – add explicit command | Dedup broken, retry logic broken |
| 🟠 P1 | Bug #6: Orchestrator declares complete too fast | Medium – add threshold check | False "success" on failed runs |
| 🟡 P2 | Bug #5: Binary file UnicodeDecodeError | Low – content-type check | Direct download links missed |
| 🟡 P2 | Bug #7: merge-reports.py brittle parser | Medium – improve parser | Source scores wrong in final report |
| 🟢 P3 | Missing: Site crawling | High – new script + prompt changes | 10x more sources on resource-rich sites |
Recommended Fix Order
- Fix agentId (P0) – without this, nothing works
- Fix append-only results writing (P0) – without this, findings are lost
- Fix IPv6 / fetch script enforcement (P0) – .gov sources are highest quality
- Fix known-urls.txt update (P1) – enables proper dedup and retry
- Fix orchestrator completion threshold (P1) – prevents false success
- Fix binary file handling (P2)
- Fix merge parser (P2)
- Build crawl capability (P3)